The impact of grain size on the efficiency of embedded SIMD image processing architectures

Authors

  • Antonio Gentile
  • Sam Sander
  • Linda M. Wills
  • D. Scott Wills
Abstract

Pixel-per-processing-element (PPE) ratio, the amount of image data directly mapped to each processing element, has a significant impact on the area and energy efficiency of embedded SIMD architectures for image processing applications. This paper quantitatively evaluates the impact of PPE ratio on system performance and efficiency for focal-plane SIMD image processing architectures by comparing throughput, area efficiency, and energy efficiency for a range of common application kernels using architectural and workload simulation. While the impact of grain size is affected by the mix of executed instructions within an application program, the most efficient PPE ratio often does not occur at the PE grain size extremes (i.e., one pixel per processor or one processor per image). In this study, a set of four image processing application tasks is implemented on eight different SIMD configurations. Each configuration has a different PPE ratio and a different amount of local memory. Cycle-accurate simulation and analytical technology modeling allow assessment of execution performance, area efficiency, and energy efficiency for each configuration. Results show that the highest area and energy efficiency are achieved at PPE ratios between 16 and 256. Using these evaluation techniques (application grain size retargeting combined with area and energy technology modeling), a new class of efficient, embedded SIMD architectures for image processing can be designed.
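
To make the PPE ratio definition concrete, the following minimal sketch computes the ratio for a given image and processor count and enumerates a sweep between the two grain-size extremes named in the abstract. The 256 x 256 image size, the list of ratios, and the function names `ppe_ratio` and `sweep_configurations` are illustrative assumptions; the abstract does not state which eight configurations the authors used.

```python
def ppe_ratio(image_width: int, image_height: int, num_pes: int) -> float:
    """Pixels mapped to each processing element (PE)."""
    if num_pes <= 0:
        raise ValueError("num_pes must be positive")
    return (image_width * image_height) / num_pes


def sweep_configurations(image_width: int, image_height: int, ratios):
    """Return (ppe_ratio, num_pes) pairs for ratios that divide the image evenly."""
    pixels = image_width * image_height
    return [(r, pixels // r) for r in ratios if pixels % r == 0]


# Hypothetical 256 x 256 image; these ratios are illustrative, not the paper's.
# The first entry is one pixel per processor, the last is one processor per image.
configs = sweep_configurations(256, 256, [1, 4, 16, 64, 256, 1024, 4096, 65536])
for ratio, pes in configs:
    print(f"PPE ratio {ratio:>5}: {pes:>5} PEs")
```

In a sweep like this, a larger PPE ratio means fewer PEs that each hold more pixels, which is the grain-size trade-off the paper evaluates against area and energy.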

Similar resources

RC-SIMD: Reconfigurable communication SIMD architecture for image processing applications

During the last two decades, Single Instruction Multiple Data (SIMD) processors have become important architectures in embedded systems for image processing applications. The main reasons are their area and energy efficiency. Often the processing elements (PEs) of an SIMD processor are only locally connected. This may result in a communication bottleneck (only access to direct neighbors). One w...

A Novel Multiply-Accumulator Unit Bus Encoding Architecture for Image Processing Applications

In CMOS circuits, power dissipation is a major concern for VLSI functional units. With shrinking feature size, increased frequency and power dissipation on the data bus have become the most important factors compared to other parts of the functional units. One of the most important functional units in any processor is the Multiply-Accumulator unit (MAC). The current work focuses on the develop...

Grain Size Effect on the Hot Deformation Processing Map of AISI 304 Austenitic Stainless Steel

In this study, the hot deformation processing map of AISI 304 austenitic stainless steel with two initial grain sizes of 15 and 40 μm was investigated. For this purpose, cylindrical samples were used in hot compression tests over a temperature range of 950-1100 °C and a strain rate range of 0.005-0.5% s⁻¹. At first, the relationship between the peak stress and Zener-Hollomon parameter w...

Implementing and Evaluating Color-Aware Instruction Set for Low-Memory, Embedded Video Processing in Data Parallel Architectures

Future embedded imaging applications will demand more processing performance while requiring the same low cost and low energy consumption. This paper presents and evaluates a color-aware instruction set extension (CAX) for single instruction, multiple data (SIMD) processor arrays to meet the computational requirements and cost goals. CAX supports parallel operations on two-packed 16-bit (...

Evaluation of Two Real Time Low Level Image Processing Architecture

The paper presents a study on the impact of using SIMD (Single Instruction Multiple Data) techniques and architectures in low-level image processing. Speedups obtained on a SIMD parallel architecture (IMAPVISION board) and a single Intel MMX processor computer are presented for different low-level image processing operators. While the IMAP-Vision system performs better because of the large numbe...

Journal:
  • J. Parallel Distrib. Comput.

Volume 64

Publication date: 2004